Smart Paper Notes

ApHMM: Accelerating Profile Hidden Markov Models for Fast and Energy-Efficient Genome Analysis

Can Firtina , Kamlesh Pillai , Gurpreet S. Kalsi , Bharathwaj Suresh , Damla Senol Cali , Jeremie Kim , Taha Shahroodi , Meryem Banu Cavlak , Joel Lindegger , Mohammed Alser

Categories: Artificial Intelligence | Machine Learning

2022-07-20

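Profile hidden Markov models (pHMMs) are widely used in many bioinformatics applications to accurately identify similarities between biological sequences (e.g., DNA or protein sequences). pHMMs use a commonly adopted and highly accurate method, called the Baum-Welch algorithm, to calculate these similarities. However, the Baum-Welch algorithm is computationally expensive, and existing works provide either software-only or hardware-only solutions for a fixed pHMM design. When we analyze the state-of-the-art works, we find that there is a pressing need for a flexible, high-performance, and energy-efficient hardware-software co-design that efficiently and effectively resolves all the major inefficiencies in the Baum-Welch algorithm for pHMMs. We propose ApHMM, the first flexible acceleration framework that can significantly reduce the computational and energy overheads of the Baum-Welch algorithm for pHMMs. ApHMM leverages hardware-software co-design to address the major inefficiencies in the Baum-Welch algorithm by 1) designing flexible hardware to support different pHMM designs, 2) exploiting the predictable data dependency pattern in on-chip memory with memoization techniques, 3) quickly eliminating negligible computations with a hardware-based filter, and 4) minimizing redundant computations. We implement our 1) hardware-software optimizations on specialized hardware and 2) software optimizations for GPUs to provide the first flexible Baum-Welch accelerator for pHMMs. ApHMM provides significant speedups of 15.55x to 260.03x, 1.83x to 5.34x, and 27.97x compared to CPU, GPU, and FPGA implementations of the Baum-Welch algorithm, respectively. ApHMM outperforms the state-of-the-art CPU implementations of three important bioinformatics applications, 1) error correction, 2) protein family search, and 3) multiple sequence alignment, by 1.29x to 59.94x, 1.03x to 1.75x, and 1.03x to 1.95x, respectively.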
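
To make the cost that the abstract refers to more concrete, here is a minimal sketch of the forward pass, one of the building blocks of Baum-Welch, for a generic HMM. It is not ApHMM's design: the function name `forward`, the dense transition table, and the `min_prob` cutoff (a rough software stand-in for the idea of skipping negligible computations) are all assumptions made for illustration, and a real pHMM has a specific match/insert/delete topology that is not modeled here.

```python
def forward(obs, init, trans, emit, min_prob=0.0):
    """Forward pass of a generic HMM (illustrative sketch only).

    obs      : list of observation indices
    init     : init[i]     = P(state i at time 0)
    trans    : trans[i][j] = P(next state j | current state i)
    emit     : emit[i][o]  = P(observation o | state i)
    min_prob : hypothetical cutoff; states whose forward value is not
               above it are skipped when computing the next time step.
    Returns (rows, likelihood), where rows[t][j] is the forward value of
    state j at time t and likelihood is the sum of the last row.
    """
    n = len(init)
    rows = [[init[i] * emit[i][obs[0]] for i in range(n)]]
    for o in obs[1:]:
        prev = rows[-1]
        # Only states that pass the cutoff contribute to the next step.
        active = [i for i in range(n) if prev[i] > min_prob]
        row = [
            sum(prev[i] * trans[i][j] for i in active) * emit[j][o]
            for j in range(n)
        ]
        rows.append(row)
    return rows, sum(rows[-1])


if __name__ == "__main__":
    init = [0.6, 0.4]
    trans = [[0.7, 0.3], [0.4, 0.6]]
    emit = [[0.5, 0.5], [0.1, 0.9]]
    _, exact = forward([0, 1], init, trans, emit)
    _, filtered = forward([0, 1], init, trans, emit, min_prob=0.05)
    print(exact, filtered)
```

With `min_prob=0.0` the result is the exact likelihood; a positive cutoff trades a small amount of accuracy for fewer multiply-accumulate operations, which is the general trade-off behind filtering negligible states.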

translated by Google Translate

Biomedical image analysis competitions: The state of current participation practice

Matthias Eisenmann , Annika Reinke , Vivienn Weru , Minu Dietlinde Tizabi , Fabian Isensee , Tim J. Adler , Patrick Godau , Veronika Cheplygina , Michal Kozubek , Sharib Ali

Categories: Computer Vision | Machine Learning

2022-12-16

The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as the bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on either multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.

DroneAttention: Sparse Weighted Temporal Attention for Drone-Camera Based Activity Recognition

Santosh Kumar Yadav , Achleshwar Luthra , Esha Pahwa , Kamlesh Tiwari , Heena Rathore , Hari Mohan Pandey , Peter Corcoran

Categories: Computer Vision

2022-12-07

Human activity recognition (HAR) using drone-mounted cameras has attracted considerable interest from the computer vision research community in recent years. A robust and efficient HAR system has a pivotal role in fields like video surveillance, crowd behavior analysis, sports analysis, and human-computer interaction. What makes it challenging are the complex poses, the varying viewpoints, and the environmental scenarios in which the action takes place. To address such complexities, in this paper, we propose a novel Sparse Weighted Temporal Attention (SWTA) module that uses sparsely sampled video frames to obtain global weighted temporal attention. The proposed SWTA comprises two parts. The first is a temporal segment network that sparsely samples a given set of frames. The second is weighted temporal attention, which fuses attention maps derived from optical flow with raw RGB images. This is followed by a basenet network, which comprises a convolutional neural network (CNN) module along with fully connected layers that perform activity recognition. The SWTA network can be used as a plug-in module for existing deep CNN architectures, optimizing them to learn temporal information and eliminating the need for a separate temporal stream. It has been evaluated on three publicly available benchmark datasets, namely Okutama, MOD20, and Drone-Action. The proposed model achieves accuracies of 72.76%, 92.56%, and 78.86% on the respective datasets, thereby surpassing the previous state-of-the-art performances by margins of 25.26%, 18.56%, and 2.94%, respectively.
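
The sparse sampling step mentioned above can be sketched roughly as follows. This is a generic segment-based sampler in the spirit of temporal segment networks, not the authors' code: the function name `sparse_sample`, its parameters, and the choice of one uniformly random frame per segment are assumptions made for illustration.

```python
import random


def sparse_sample(num_frames, num_segments, rng=None):
    """Pick one frame index from each of num_segments equal-ish chunks.

    The video [0, num_frames) is split into num_segments contiguous
    chunks, and one index is drawn uniformly at random from each chunk.
    (Hypothetical helper; the paper's exact sampling rule may differ.)
    """
    if num_segments <= 0 or num_frames < num_segments:
        raise ValueError("need 0 < num_segments <= num_frames")
    if rng is None:
        rng = random.Random()
    # Chunk k covers indices bounds[k] .. bounds[k + 1] - 1.
    bounds = [k * num_frames // num_segments for k in range(num_segments + 1)]
    return [rng.randrange(bounds[k], bounds[k + 1]) for k in range(num_segments)]


if __name__ == "__main__":
    # A 100-frame clip reduced to 4 frames, one per quarter of the clip.
    print(sparse_sample(100, 4, random.Random(0)))
```

Sampling one frame per segment keeps the number of processed frames fixed regardless of clip length while still covering the whole clip, which is why such samplers are a common front end for video models.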

Multi-Label Chest X-Ray Classification via Deep Learning

Aravind Sasidharan Pillai

Categories: Computer Vision

2022-11-27

In this era of pandemic, the future of the healthcare industry has never been more exciting. Artificial intelligence and machine learning (AI & ML) present opportunities to develop solutions that cater to very specific needs within the industry. Deep learning in healthcare has become incredibly powerful for supporting clinics and for transforming patient care in general. Deep learning is increasingly being applied to the detection of clinically important features in images beyond what can be perceived by the naked human eye. Chest X-ray imaging is one of the most common clinical methods for diagnosing a number of diseases, such as pneumonia and lung cancer, and many other abnormalities, such as lesions and fractures. Proper diagnosis of a disease from X-ray images is often a challenging task even for expert radiologists, and there is a growing need for computerized support systems due to the large amount of information encoded in X-ray images. The goal of this paper is to develop a lightweight solution to detect 14 different chest conditions from an X-ray image. Given an X-ray image as input, our classifier outputs a label vector indicating which of the 14 disease classes the image falls into. Along with the image features, we also use non-image features available in the data, such as X-ray view type, age, and gender. The original study conducted by the Stanford ML Group is our baseline. The original study focuses on predicting 5 diseases. Our aim is to improve upon previous work, expand prediction to 14 diseases, and provide insight for future chest radiography research.
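
As a rough illustration of what a multi-label output looks like, the sketch below turns 14 raw classifier scores into a 0/1 label vector using an independent sigmoid per class and a fixed threshold. The abstract does not state how the label vector is produced, so the sigmoid, the 0.5 threshold, and the names `sigmoid` and `predict_labels` are assumptions.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_labels(logits, threshold=0.5):
    """Map one raw score per class to a 0/1 label vector.

    Each class is decided independently, so an image can carry several
    labels at once or none at all (unlike softmax classification).
    The threshold value is an illustrative assumption.
    """
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]


if __name__ == "__main__":
    # Hypothetical scores for 14 chest conditions.
    scores = [2.1, -3.0, 0.4, -0.2, -5.0, 1.3, -1.1,
              -2.2, 0.0, -0.7, 3.3, -4.1, -0.9, -1.5]
    print(predict_labels(scores))
```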

SWTF: Sparse Weighted Temporal Fusion for Drone-Based Activity Recognition

Santosh Kumar Yadav , Esha Pahwa , Achleshwar Luthra , Kamlesh Tiwari , Hari Mohan Pandey , Peter Corcoran

Categories: Computer Vision

2022-11-10

Drone-camera based human activity recognition (HAR) has received significant attention from the computer vision research community in the past few years. A robust and efficient HAR system has a pivotal role in fields like video surveillance, crowd behavior analysis, sports analysis, and human-computer interaction. What makes it challenging are the complex poses, the varying viewpoints, and the environmental scenarios in which the action takes place. To address such complexities, in this paper, we propose a novel Sparse Weighted Temporal Fusion (SWTF) module that uses sparsely sampled video frames to obtain a global weighted temporal fusion outcome. The proposed SWTF is divided into two components. The first is a temporal segment network that sparsely samples a given set of frames. The second is weighted temporal fusion, which fuses feature maps derived from optical flow with raw RGB images. This is followed by a base network, which comprises a convolutional neural network module along with fully connected layers that perform activity recognition. The SWTF network can be used as a plug-in module for existing deep CNN architectures, optimizing them to learn temporal information and eliminating the need for a separate temporal stream. It has been evaluated on three publicly available benchmark datasets, namely Okutama, MOD20, and Drone-Action. The proposed model achieves accuracies of 72.76%, 92.56%, and 78.86% on the respective datasets, thereby surpassing the previous state-of-the-art performances by a significant margin.

Document Image Binarization in JPEG Compressed Domain using Dual Discriminator Generative Adversarial Networks

Bulla Rajesh , Manav Kamlesh Agrawal , Milan Bhuva , Kisalaya Kishore , Mohammed Javed

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2022-09-13

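Image binarization techniques are commonly used to enhance noisy and/or degraded images for different Document Image Analysis (DIA) applications such as word spotting, document retrieval, and OCR. Most existing techniques focus on feeding pixel images into convolutional neural networks to accomplish document binarization, which may not produce effective results when working with compressed images that need to be processed without full decompression. Therefore, in this research paper, the idea of document image binarization directly on JPEG compressed document images is proposed by employing Dual Discriminator Generative Adversarial Networks (DD-GANs). Here, the two discriminator networks, global and local, operate on different image ratios, and focal loss is used as the generator loss. The proposed model has been thoroughly tested on different versions of the DIBCO dataset, which contains challenges such as holes, erased or smudged ink, dust, and misplaced fibers. The model proved to be highly robust and efficient in terms of both time and space complexity, and it also achieved state-of-the-art performance in the JPEG compressed domain.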
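
For readers unfamiliar with focal loss, the following is a minimal per-pixel sketch of the standard binary form, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). The abstract only states that focal loss serves as the generator loss; the default values gamma = 2 and alpha = 0.25 are the commonly cited ones from the original focal loss formulation and are not confirmed as this paper's settings. The function name and the `eps` clamp are illustrative assumptions.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for a single pixel.

    p : predicted probability that the pixel is foreground (e.g. ink)
    y : ground-truth label, 1 for foreground and 0 for background
    gamma, alpha : focusing and class-balance parameters (assumed values)
    """
    p_t = p if y == 1 else 1.0 - p          # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)           # avoid log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    # A confidently correct pixel contributes far less than an uncertain one.
    print(focal_loss(0.95, 1), focal_loss(0.55, 1))
```

The factor (1 - p_t)^gamma shrinks the loss of well-classified pixels, so training concentrates on hard pixels, which is useful when background pixels vastly outnumber text pixels.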

A Hybrid Complex-valued Neural Network Framework with Applications to Electroencephalogram (EEG)

Hang Du , Rebecca Pillai Riddell , Xiaogang Wang

Categories: Machine Learning

2022-07-28

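In this article, we present a new EEG signal classification framework that integrates complex-valued and real-valued convolutional neural networks (CNNs) with the discrete Fourier transform (DFT). The proposed neural network architecture consists of one complex-valued convolutional layer, two real-valued convolutional layers, and three fully connected layers. Our method can efficiently utilize the phase information contained in the DFT. We validate our approach using two simulated EEG signals and a benchmark dataset, and compare it with two widely used frameworks. Compared with existing methods for classifying the benchmark dataset, our method drastically reduces the number of parameters used and improves accuracy, and it significantly improves performance in classifying the simulated EEG signals.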
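
The remark about phase information can be illustrated with a small sketch: two signals with identical DFT magnitudes can differ only in phase, so a model that sees only magnitudes cannot tell them apart, whereas complex-valued inputs keep both. The code is a plain textbook DFT, not the paper's network, and the helper names are assumptions.

```python
import cmath
import math


def dft(x):
    """Direct O(n^2) discrete Fourier transform of a real or complex list."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def magnitude_and_phase(x):
    """Return (magnitudes, phases) of the DFT coefficients of x."""
    spec = dft(x)
    return [abs(c) for c in spec], [cmath.phase(c) for c in spec]


if __name__ == "__main__":
    n = 8
    cos_sig = [math.cos(2 * math.pi * t / n) for t in range(n)]
    sin_sig = [math.sin(2 * math.pi * t / n) for t in range(n)]
    mag_c, ph_c = magnitude_and_phase(cos_sig)
    mag_s, ph_s = magnitude_and_phase(sin_sig)
    # Same magnitude at frequency bin 1, but the phases differ by pi/2.
    print(mag_c[1], mag_s[1], ph_c[1], ph_s[1])
```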

Multi-head Cascaded Swin Transformers with Attention to k-space Sampling Pattern for Accelerated MRI Reconstruction

Mevan Ekanayake , Kamlesh Pawar , Mehrtash Harandi , Gary Egan , Zhaolin Chen

Categories: Artificial Intelligence | Computer Vision | Machine Learning

2022-07-18

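Global correlations are widely seen in human anatomical structures due to the similarity across tissues and bones. These correlations are reflected in magnetic resonance imaging (MRI) scans as a result of close-range proton density and T1/T2 parameters. Furthermore, to achieve accelerated MRI, k-space data are undersampled, which causes global aliasing artifacts. Convolutional neural network (CNN) models are widely used for accelerated MRI reconstruction, but these models are limited in capturing global correlations due to the intrinsic locality of the convolution operation. Self-attention-based transformer models are capable of capturing global correlations among image features; however, the current contributions of transformer models to MRI reconstruction are minute. The existing contributions mostly provide CNN-transformer hybrid solutions and rarely leverage the physics of MRI. In this paper, we propose a physics-based stand-alone (convolution-free) transformer model, titled the Multi-head Cascaded Swin Transformers (McSTRA), for accelerated MRI reconstruction. McSTRA combines several interconnected MRI physics-related concepts with transformer networks: it exploits global MR features via the shifted-window self-attention mechanism; it extracts MR features belonging to different spectral components separately using a multi-head setup; it iterates between intermediate de-aliasing and k-space correction via a cascaded network with data consistency in k-space and intermediate loss computations; furthermore, we propose a novel positional embedding generation mechanism to guide self-attention using the point spread function corresponding to the undersampling mask. Our model significantly outperforms state-of-the-art MRI reconstruction methods both visually and quantitatively, while depicting improved resolution and removal of aliasing artifacts.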
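
The "data consistency in k-space" idea can be sketched in its simplest (hard) form: wherever the scanner actually measured a k-space sample, the measured value overrides the network's estimate, and only the unsampled locations keep the estimate. The abstract does not say whether McSTRA uses this hard replacement or a weighted variant, so the rule below, the flat 1D layout, and the name `data_consistency` are assumptions.

```python
def data_consistency(k_pred, k_meas, mask):
    """Hard data-consistency step on flattened k-space.

    k_pred : k-space values estimated by the reconstruction network
    k_meas : acquired k-space values (only meaningful where mask is 1)
    mask   : undersampling mask, 1 where a sample was acquired, else 0
    Returns k-space that agrees with the measurements at sampled points.
    """
    if not (len(k_pred) == len(k_meas) == len(mask)):
        raise ValueError("k_pred, k_meas and mask must have equal length")
    return [meas if m else pred for pred, meas, m in zip(k_pred, k_meas, mask)]


if __name__ == "__main__":
    predicted = [1 + 1j, 2 + 0j, 3 - 1j, 4 + 2j]
    measured = [10 + 0j, 0j, 30 + 0j, 0j]   # zeros where nothing was acquired
    mask = [1, 0, 1, 0]
    print(data_consistency(predicted, measured, mask))
```

In a cascaded network, a step like this is typically applied after each intermediate reconstruction, so that later stages never drift away from the acquired data.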

PaLM: Scaling Language Modeling with Pathways

Aakanksha Chowdhery , Sharan Narang , Jacob Devlin , Maarten Bosma , Gaurav Mishra , Adam Roberts , Paul Barham , Hyung Won Chung , Charles Sutton , Sebastian Gehrmann

Categories: Natural Language Processing

2022-04-05

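Large language models have been shown to achieve remarkable performance across a variety of natural language tasks using few-shot learning, which drastically reduces the number of task-specific training examples needed to adapt the model to a particular application. To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion-parameter, densely activated Transformer language model, which we call the Pathways Language Model (PaLM). We trained PaLM on 6144 TPU v4 chips using Pathways, a new ML system that enables highly efficient training across multiple TPU Pods. We demonstrate the continued benefits of scaling by achieving state-of-the-art few-shot learning results on hundreds of language understanding and generation benchmarks. On a number of these tasks, PaLM 540B achieves breakthrough performance, outperforming the fine-tuned state of the art on a suite of multi-step reasoning tasks and outperforming average human performance on the recently released BIG-bench benchmark. A significant number of BIG-bench tasks showed discontinuous improvements with model scale, meaning that performance increased steeply as we scaled to our largest model. PaLM also has strong capabilities in multilingual tasks and source code generation, which we demonstrate on a wide array of benchmarks. We additionally provide a comprehensive analysis of bias and toxicity, and study the extent of training data memorization with respect to model scale. Finally, we discuss the ethical considerations related to large language models and discuss potential mitigation strategies.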

Cross-Domain Federated Learning in Medical Imaging

Vishwa S Parekh , Shuhao Lai , Vladimir Braverman , Jeff Leal , Steven Rowe , Jay J Pillai , Michael A Jacobs

Categories: Artificial Intelligence | Computer Vision | Machine Learning

2021-12-18

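Federated learning is increasingly being explored in the field of medical imaging to train deep learning models on large-scale datasets distributed across different data centers while preserving privacy by avoiding the transfer of sensitive patient information. In this manuscript, we explore federated learning in a multi-domain, multi-task setting, in which different participating nodes may contain datasets from different domains and are trained to solve different tasks. We evaluated cross-domain federated learning for the tasks of object detection and segmentation in two different experimental settings: multi-modal and multi-organ. The results of our experiments with the cross-domain federated learning framework were very encouraging, with an overlap similarity of 0.79 for organ localization and 0.65 for lesion segmentation. Our results demonstrate the potential of federated learning for developing multi-domain, multi-task deep learning models without sharing data from different domains.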
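
To show how models can be combined without moving patient data, the sketch below implements plain federated averaging: each node sends only its parameter vector and its local sample count, and the server returns the size-weighted mean. The abstract does not name the aggregation rule used in the study, so federated averaging, the flat parameter lists, and the name `federated_average` are assumptions chosen for illustration.

```python
def federated_average(client_weights, client_sizes):
    """Size-weighted average of client parameter vectors.

    client_weights : list of parameter lists, one per participating node
    client_sizes   : number of local training samples at each node
    Only parameters and counts are exchanged; raw images stay on site.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one size per client and at least one client")
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * s for w, s in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


if __name__ == "__main__":
    # Two hypothetical sites with 100 and 300 local scans.
    site_a = [1.0, 2.0]
    site_b = [3.0, 4.0]
    print(federated_average([site_a, site_b], [100, 300]))
```

In a cross-domain or multi-task setting, a common variation is to average only the shared layers and keep task-specific heads local; whether the study does this is not stated in the abstract.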
